ABSTRACT

We review a range of statistical methods for analyzing the structures of star
clusters, and derive a new measure $Q$ which both quantifies, and distinguishes
between, a (relatively smooth) large-scale radial density gradient and
multi-scale (fractal) sub-clustering.

The distribution of separations $p(s)$ is considered, and the Normalised
Correlation Length $\bar{s}$ (i.e. the mean separation between stars, divided
by the overall radius of the cluster) is shown to be a robust indicator of the
extent to which a smooth cluster is centrally concentrated. For spherical
clusters having volume-density $n \propto r^{-\alpha}$ (with $\alpha$ between 0
and 2), $\bar{s}$ decreases monotonically with $\alpha$, from $\sim 0.8$ to
$\sim 0.6$. Since $\bar{s}$ reflects all star positions, it implicitly
incorporates edge effects. However, for fractal star clusters (with fractal
dimension $D$ between 1.5 and 3), $\bar{s}$ also decreases monotonically as $D$
decreases (from $\sim 0.8$ at $D = 3$ to $\sim 0.6$ at $D = 1.5$). Hence
$\bar{s}$, on its own, can quantify, but cannot distinguish between, a smooth
large-scale radial density gradient and multi-scale (fractal) sub-clustering.

The Minimal Spanning Tree (MST) is then considered, and it is shown that the
Normalised Mean Edge Length $\bar{m}$ (i.e. the mean length of the branches of
the tree, divided by
$(\mathcal{N}_{\rm total} A)^{1/2}/(\mathcal{N}_{\rm total} - 1)$, where $A$ is
the area of the cluster and $\mathcal{N}_{\rm total}$ is the number of stars)
can also quantify, but again cannot on its own distinguish between, a smooth
large-scale radial density gradient and multi-scale (fractal) sub-clustering.

However, the combination $Q = \bar{m}/\bar{s}$ does both quantify, and
distinguish between, a smooth large-scale radial density gradient and
multi-scale (fractal) sub-clustering. IC348 has $Q = 0.98$ and $\rho$ Ophiuchus
has $Q = 0.85$, implying that both are centrally concentrated clusters with,
respectively, $\alpha \simeq 2.2 \pm 0.2$ and $\alpha \simeq 1.2 \pm 0.3$.
Chamaeleon and IC2391 have $Q = 0.67$ and $Q = 0.66$ respectively, implying
mild substructure with a notional fractal dimension $D \simeq 2.25 \pm 0.25$.
Taurus has even more sub-structure, with $Q = 0.45$ implying
$D' \simeq 1.55 \pm 0.25$. If the binaries in Taurus are treated as single
systems, $Q$ increases to 0.58 and $D'$ increases to $1.9 \pm 0.2$.

Key words: open clusters and associations: general 



1 INTRODUCTION 

Since most stars are formed in clusters, it would be useful 
to have quantitative and objective statistical measures of 
their structure, with a view to comparing clusters formed 
in different environments, and tracking changes in structure 
as clusters evolve. This is particularly important for young, 
embedded clusters, where the structure may yield impor- 
tant clues to the formation process but is changing rapidly. 
It is also important for comparing observed clusters with 
numerical simulations. 

At present, we do not have sufficiently robust statistical measures for this
purpose. Features which are easily identified by the human eye, such as
sub-clusters, or linear features, can be strangely elusive to objective
statistical analysis. For example, it is difficult to distinguish,
statistically, between a degree of fractal or random sub-clustering, and the
existence of a density gradient (Bate, Clarke & McCaughrean 1997). This paper
explores some possible measures, and evaluates their usefulness. In particular,
we find a robust objective measure which both quantifies, and distinguishes
between, a smooth large-scale radial density gradient and multi-scale (fractal)
sub-clustering.

In Section 2 we describe our methodology. In Section 3 we look again at the
Mean Surface Density of Companions (MSDC), a tool pioneered by Larson (1995)
and subsequently used by several others (e.g. Simon 1995; Bate, Clarke &
McCaughrean 1997; Nakajima et al. 1998; Brandner & Köhler 1998; Gladwin et
al. 1999; Klessen & Kroupa 2001). We focus on measures which reflect the
clustering regime (wide separations) rather than the binary regime (close
separations). In Section 4 we explore the use of the Minimal Spanning Tree
(Barrow, Bhavsar & Sonoda 1985) and its derivatives. In Section 5 we combine
the MSDC and the MST to derive a single measure $Q$ which is able both to
quantify, and to distinguish between, a smooth radial density gradient and
multi-scale (fractal) sub-clustering.

All the measures are tested and calibrated on multiple realizations of
artificial star clusters, and applied to $\rho$ Ophiuchus, Chamaeleon, Taurus,
IC348 and IC2391. Our results are discussed in Section 6, and the main
conclusions are summarized in Section 7.



2 METHODOLOGY 

Three different types of artificial star cluster have been created, using
random numbers $\mathcal{R}$ to generate the individual star positions. The
first type (2D$\alpha$) are circular clusters (i.e. two-dimensional disc-like
clusters) with surface density $N \propto r^{-\alpha}$ and $\alpha = 0$ or 1.
The second type (3D$\alpha$) are spherical clusters (i.e. three-dimensional
clusters) having volume density $n \propto r^{-\alpha}$ with $\alpha = 0$, 1,
2, and 2.9. The third type (FD) are fractal star clusters (again three
dimensional) with fractal dimension $D = 3.0$, 2.5, 2.0, or 1.5.

The different types are listed in Column 1 of Table 1. 
All of the artificial clusters are created with 100 to 300 stars, 
as the numbers of stars within the five real clusters lie within 
that range. The data for the five real clusters used are illus- 
trated in Appendix A, and the sources listed in Table 2. 

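A cluster of type 2D$\alpha$ is created by positioning the stars
according to

$$
\begin{aligned}
r    &= \left\{ \frac{(2-\alpha)\,\mathcal{R}_r}{2} \right\}^{1/(2-\alpha)} , \\
\phi &= 2\pi\,\mathcal{R}_\phi , \\
x    &= r\cos(\phi) , \\
y    &= r\sin(\phi) ,
\end{aligned}
\tag{1}
$$

where $\mathcal{R}_r$ and $\mathcal{R}_\phi$ are random numbers in the range 0-1.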
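
The sampling rule of Eqn. (1) can be sketched in a few lines of Python. This
is a minimal illustration rather than the code used for the paper: the
function name `make_2d_cluster` and the use of a seeded `random.Random`
generator are assumptions made here.

```python
import math
import random


def make_2d_cluster(n_stars, alpha, rng):
    """Return n_stars (x, y) positions drawn as in Eqn. (1), for alpha < 2."""
    stars = []
    for _ in range(n_stars):
        # Radius from Eqn. (1); rng.random() plays the role of R_r.
        r = ((2.0 - alpha) * rng.random() / 2.0) ** (1.0 / (2.0 - alpha))
        # Azimuth uniform in [0, 2*pi); rng.random() plays the role of R_phi.
        phi = 2.0 * math.pi * rng.random()
        stars.append((r * math.cos(phi), r * math.sin(phi)))
    return stars


# Hypothetical usage: a 200-star cluster of type 2D1 with a fixed seed.
cluster_2d1 = make_2d_cluster(200, 1.0, random.Random(1))
```

With this normalisation the outer radius depends on $\alpha$ (it is 1 for
$\alpha = 0$ and 1/2 for $\alpha = 1$), which should not matter for the
measures below because separations are later divided by the cluster radius.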

A cluster of type 3D$\alpha$ is created by positioning the stars
according to

$$
\begin{aligned}
r      &= \left\{ \frac{(3-\alpha)\,\mathcal{R}_r}{3} \right\}^{1/(3-\alpha)} , \\
\theta &= \cos^{-1}(2\mathcal{R}_\theta - 1) , \\
\phi   &= 2\pi\,\mathcal{R}_\phi , \\
x      &= r\sin(\theta)\cos(\phi) , \\
y      &= r\sin(\theta)\sin(\phi) , \\
z      &= r\cos(\theta) ,
\end{aligned}
\tag{2}
$$

where $\mathcal{R}_r$, $\mathcal{R}_\theta$ and $\mathcal{R}_\phi$ are random
numbers in the range 0-1. Clearly this method cannot be used for $\alpha = 3$,
so to have a cluster type approximating to $\alpha = 3$ we use $\alpha = 2.9$.
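
A corresponding sketch for Eqn. (2) is given below, together with a projection
step. The names `make_3d_cluster` and `project` are illustrative assumptions,
and discarding $z$ is just one convenient choice of "arbitrary plane" for a
statistically spherical cluster.

```python
import math
import random


def make_3d_cluster(n_stars, alpha, rng):
    """Return n_stars (x, y, z) positions drawn as in Eqn. (2), for alpha < 3."""
    stars = []
    for _ in range(n_stars):
        r = ((3.0 - alpha) * rng.random() / 3.0) ** (1.0 / (3.0 - alpha))
        theta = math.acos(2.0 * rng.random() - 1.0)  # uniform in cos(theta)
        phi = 2.0 * math.pi * rng.random()
        stars.append((r * math.sin(theta) * math.cos(phi),
                      r * math.sin(theta) * math.sin(phi),
                      r * math.cos(theta)))
    return stars


def project(stars):
    """Project onto the (x, y) plane by discarding z (an assumed choice)."""
    return [(x, y) for x, y, _ in stars]


# Hypothetical usage: a 200-star cluster of type 3D2, then its projection.
cluster_3d2 = make_3d_cluster(200, 2.0, random.Random(1))
projected_3d2 = project(cluster_3d2)
```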

A cluster of type FD is created by defining an ur-cube with side 2, and
placing an ur-parent at the centre of the ur-cube. Next, the ur-cube is
divided into $\mathcal{N}_{\rm div}^3$ equal sub-cubes, and a child is placed
at the centre of each sub-cube (the first generation). Normally we use
$\mathcal{N}_{\rm div} = 2$, in which case there are 8 sub-cubes and 8
first-generation children. The probability that a child matures to become a
parent in its own right is $\mathcal{N}_{\rm div}^{(D-3)}$, where $D$ is the
fractal dimension. For lower $D$, the probability that a child matures to
become a parent is lower, and the cluster is more 'porous'.



Table 1. Clustering measures obtained for artificial and real star
clusters. Column 1 lists the cluster type (for artificial clusters)
or name (for real clusters). Column 2 gives the Normalized
Correlation Length $\bar{s}$ (i.e. the ratio of the mean separation to the
cluster radius, see Section 3). Column 3 gives the Normalised
Mean Edge Length $\bar{m}$ (see Section 4). Column 4 gives the mean
value of the standard deviation of the edge length, $\sigma_m$. Column
5 gives $Q = \bar{m}/\bar{s}$. For the artificial star clusters, means and
standard deviations are computed from 100 realisations of each type,
with $100 < \mathcal{N}_{\rm total} < 300$.

Cluster type or name             $\bar{s}$      $\bar{m}$      $\sigma_m$     $Q$
2D0   ($N \propto r^{0}$)        0.88 ± 0.03    0.65 ± 0.02    0.31 ± 0.02    0.74 ± 0.02
2D1   ($N \propto r^{-1}$)       0.70 ± 0.03    0.61 ± 0.02    0.38 ± 0.02    0.85 ± 0.03
3D2.9 ($n \propto r^{-2.9}$)     0.16 ± 0.02    0.24 ± 0.05    0.59 ± 0.07    1.50 ± 0.13
3D2   ($n \propto r^{-2}$)       0.60 ± 0.03    0.55 ± 0.02    0.41 ± 0.03    0.93 ± 0.03
3D1   ($n \propto r^{-1}$)       0.73 ± 0.03    0.61 ± 0.02    0.33 ± 0.03    0.84 ± 0.02
3D0   ($n \propto r^{0}$)        0.80 ± 0.02    0.63 ± 0.02    0.31 ± 0.02    0.79 ± 0.02
F3.0  ($D = 3.0$)                0.81 ± 0.03    0.64 ± 0.02    0.30 ± 0.02    0.80 ± 0.02
F2.5  ($D = 2.5$)                0.74 ± 0.09    0.54 ± 0.05    0.28 ± 0.03    0.73 ± 0.06
F2.0  ($D = 2.0$)                0.67 ± 0.13    0.41 ± 0.04    0.28 ± 0.02    0.61 ± 0.08
F1.5  ($D = 1.5$)                0.62 ± 0.18    0.27 ± 0.07    0.35 ± 0.07    0.45 ± 0.09
IC2391                           0.74           0.49           0.30           0.66
Chamaeleon                       0.63           0.42           0.45           0.67
Taurus                           0.55           0.26           0.56           0.47
$\rho$ Ophiuchus                 0.53           0.45           0.39           0.85
IC348                            0.49           0.48           0.46           0.98

Children who do not mature are deleted, along with the ur-parent. A little
noise is then added to the positions of the remaining children, to avoid an
obviously regular structure, and they then become the parents of the next
generation, each one spawning $\mathcal{N}_{\rm div}^3$ children (the second
generation) at the centres of $\mathcal{N}_{\rm div}^3$ equal-volume
sub-sub-cubes, and with each second-generation child having a probability
$\mathcal{N}_{\rm div}^{(D-3)}$ of maturing to become a parent. This process is
repeated recursively until there is a sufficiently large generation that, even
after pruning to impose a spherically symmetric envelope of radius 1 within
the ur-cube, there are still more children than the required number of stars.
Children are then culled randomly until the required number is left, and the
surviving children are identified with the stars of the cluster. At each
generation, the survival of a child is determined by generating a random
number $\mathcal{R}$ in (0, 1); survival then requires that
$\mathcal{R} < \mathcal{N}_{\rm div}^{(D-3)}$.
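
The recursive construction of a type FD cluster might be sketched as follows.
Several details are assumptions made for this illustration and are not taken
from the text: the function name, the noise model (uniform jitter of amplitude
`noise` times the sub-cube side), the cap `max_gen` on the number of
generations, and the decision to start again if every child in a generation
fails to mature.

```python
import random


def make_fractal_cluster(n_stars, fractal_dim, rng, n_div=2, noise=0.1, max_gen=12):
    """Return n_stars (x, y, z) positions inside the unit sphere."""
    survive_prob = n_div ** (fractal_dim - 3.0)
    # Sub-cube centres along one axis, in units of the parent cube side.
    offsets = [(i + 0.5) / n_div - 0.5 for i in range(n_div)]
    while True:  # start again if the family tree dies out
        parents, side = [(0.0, 0.0, 0.0)], 2.0  # ur-parent in an ur-cube of side 2
        for _ in range(max_gen):
            child_side = side / n_div
            children = []
            for px, py, pz in parents:
                for ox in offsets:
                    for oy in offsets:
                        for oz in offsets:
                            if rng.random() >= survive_prob:
                                continue  # this child does not mature
                            # Assumed noise model: small uniform jitter.
                            jx, jy, jz = (noise * child_side * (rng.random() - 0.5)
                                          for _ in range(3))
                            children.append((px + ox * side + jx,
                                             py + oy * side + jy,
                                             pz + oz * side + jz))
            # Spherical envelope of radius 1 within the ur-cube.
            inside = [c for c in children
                      if c[0] ** 2 + c[1] ** 2 + c[2] ** 2 <= 1.0]
            if len(inside) > n_stars:
                return rng.sample(inside, n_stars)  # random cull to n_stars
            if not children:
                break  # extinct: restart from the ur-parent
            parents, side = children, child_side


# Hypothetical usage: a 150-star cluster of type F2.0 with a fixed seed.
cluster_f20 = make_fractal_cluster(150, 2.0, random.Random(3))
```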

Clusters of type 2D$\alpha$ are investigated for two purposes. First, we wish
to clarify the effect of a sharply defined circular edge on an otherwise
statistically uniform, two-dimensional distribution of stars. Clusters of type
2D0 enable us to isolate this effect. Second, we wish to explore how readily
two-dimensional and three-dimensional distributions can be distinguished. This
could be important if stars are being formed in layers, for example at a shock
front.

For each type of artificial cluster, 100 realisations are analysed, so that
means and standard deviations can be obtained for the parameters extracted.
Three-dimensional clusters (types 3D$\alpha$ and FD) are projected onto an
arbitrary plane prior to analysis. Two-dimensional clusters are viewed
face-on.

Figure 1. Distribution function $p(s)$ for separations between randomly chosen
stars in artificial (non-fractal) clusters of type (a) 2D0, $N \propto r^{0}$;
(b) 2D1, $N \propto r^{-1}$; (c) 3D0, $n \propto r^{0}$; (d) 3D1,
$n \propto r^{-1}$; (e) 3D2, $n \propto r^{-2}$; and (f) 3D2.9,
$n \propto r^{-2.9}$. The solid line is the value of $p(s)$ for a star cluster
of type 2D0 having an infinite number of stars (Eqn. 4), and is included for
reference. The dashed line is $p(s) = 2s$ (see text). $s$ is normalized to the
overall radius of the cluster, as described in the text.

3 THE MEAN SURFACE DENSITY OF 
COMPANIONS 

3.1 Log-log plots and edge effects 

A widely used tool for analysing the structure of star clusters is the log/log
plot of the mean surface-density of companions, $N$, against separation, $s$.
This tool was pioneered by Larson (1995), building on earlier work by Gomez et
al. (1993), who used the two-point correlation function. Several papers have
confirmed Larson's finding that a plot of $\log[N]$ against $\log[s]$
(hereafter a Larson Plot) can be fitted with two power-law sections,
corresponding to two distinct regimes. At the smaller separations,
$s < s_{\rm break}$, a star's companions are mainly in binary and higher
multiple systems, and the slope of the Larson Plot is
$\eta_{\rm binary} = d\log[N]/d\log[s] \simeq -2$. At larger separations,
$s > s_{\rm break}$, companions are simply other members of the overall
cluster, and may only be close due to projection. The slope here is generally
larger (i.e. still negative but smaller in magnitude),
$\eta_{\rm cluster} = d\log[N]/d\log[s] \gtrsim -1$. Larson has suggested that
$\eta_{\rm cluster}$ might be related to the fractal dimension of the
sub-clustering, $D = \eta_{\rm cluster} + 2$. In addition, he has proposed that
the break point between the two straight sections, at $s_{\rm break}$, might
correspond to the Jeans length. However, recent analysis has cast some doubt
on these interpretations. First, the break point is strongly influenced by the
overall surface-density of stars (and hence by the depth of the cluster along
the line of sight), as pointed out by Simon (1997) and Bate et al. (1997).
Second, fitting $\eta_{\rm cluster}$ objectively is difficult, because at the
low-$s$ end it is distorted by the switch to the binary regime, and (more
importantly) at the high-$s$ end it is distorted by edge effects.
Consequently, one is left with at best a range of order $2 s_{\rm break}$ to
$0.1\,R_{\rm cluster}$, and the result is sensitive to how the range is
actually chosen; if the range is shortened or extended arbitrarily, the slope
of the fitted line may change dramatically. Third, $\eta_{\rm cluster}$ is not
necessarily related to the fractal dimension of the clustering. As shown by
Bate et al. (1997) and Klessen & Kroupa (2001), it may simply reflect a
large-scale density gradient in the cluster.

Figure 2. Distribution functions for separations between randomly chosen stars
in five real star clusters, (a) $\rho$ Ophiuchus, (b) IC2391, (c) IC348,
(d) Taurus, and (e) Chamaeleon. The solid line is the value of $p(s)$ for a
star cluster of type 2D0 having an infinite number of stars (Eqn. 4), and is
included for reference. In (d) and (e), the solid line represents a smoothed
version of the raw data, to show the existence of multiple maxima.



3.2 Linear plots and edge effects 

An alternative way of evaluating the data from which Larson plots are derived
is to calculate the distribution function $p(s)$, where $p(s)\,ds$ gives the
probability that the projected separation between two cluster stars chosen at
random is in the interval $(s, s + ds)$. To do this empirically, we define
$i_{\max}$ equal $s$-bins in the range $0 < s < 2R_{\rm cluster}$, so that all
the bins have width $\Delta s = 2R_{\rm cluster}/i_{\max}$, and the $i$th bin
corresponds to the interval $(i-1)\Delta s < s < i\Delta s$, with mean value
$s_i = (i - 1/2)\Delta s$. $R_{\rm cluster}$ is the overall radius of the
cluster, and is defined by finding the mean position of all the stars in the
cluster and then setting $R_{\rm cluster}$ equal to the distance to the
furthest star. Then we count the number of separations $\mathcal{N}_i$ falling
in each bin, and put

$$
p(s_i) = \frac{2\,\mathcal{N}_i}
              {\mathcal{N}_{\rm total}\,(\mathcal{N}_{\rm total} - 1)\,\Delta s} ,
\tag{3}
$$

where $\mathcal{N}_{\rm total}$ is the total number of stars in the cluster,
and hence $\mathcal{N}_{\rm total}(\mathcal{N}_{\rm total} - 1)/2$ is the total
number of separations.
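
The binning of Eqn. (3) can be sketched as below. This is an illustrative
implementation, not the authors' code; the function name
`separation_distribution` is an assumption, and separations are expressed in
units of $R_{\rm cluster}$ so that the bins span 0 to 2.

```python
import math


def separation_distribution(stars, n_bins):
    """Return (bin centres, p values) following Eqn. (3), with s in units of R_cluster."""
    n = len(stars)
    # Cluster centre = mean position; R_cluster = distance to the furthest star.
    cx = sum(x for x, _ in stars) / n
    cy = sum(y for _, y in stars) / n
    r_cluster = max(math.hypot(x - cx, y - cy) for x, y in stars)
    width = 2.0 / n_bins  # Delta s in units of R_cluster
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            s = math.hypot(stars[i][0] - stars[j][0],
                           stars[i][1] - stars[j][1]) / r_cluster
            # Guard against s landing exactly on the upper edge of the last bin.
            counts[min(int(s / width), n_bins - 1)] += 1
    norm = n * (n - 1) * width
    centres = [(i + 0.5) * width for i in range(n_bins)]
    return centres, [2.0 * c / norm for c in counts]
```

By construction the returned values integrate to unity, i.e. the sum of
`p * width` over all bins is 1.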

Figure 1(a) presents the results obtained from 100 clusters of type 2D0, i.e.
a disc having statistically uniform surface-density. The plotted points give
the mean $p(s_i)$ from the 100 realizations, and the error bars give the width
of the bin and the $1\sigma$ standard deviation. If there were no edge effects
(i.e. if the uniform surface-density extended to infinity in two dimensions),
we would have $p(s) = 2s$, and this is indeed a good fit to $p(s_i)$ at small
$s_i$ values, as indicated by the dashed line on Fig. 1(a). Departures from
this straight line are entirely due to edge effects.

In fact, $p(s)$ can be calculated semi-analytically for a
disc having uniform surface-density:

$$
p(s) =
\begin{cases}
2s(1-s)^2 + \dfrac{4s}{\pi}\displaystyle\int_{1-s}^{1} \theta\, r\, dr , & 0 < s < 1 ; \\[1.5ex]
\dfrac{4s}{\pi}\displaystyle\int_{s-1}^{1} \theta\, r\, dr , & 1 < s < 2 ; \\[1.5ex]
0 , & s > 2 ;
\end{cases}
\tag{4}
$$

where

$$
\theta = \cos^{-1}\left\{ \frac{r^2 + s^2 - 1}{2rs} \right\} .
\tag{5}
$$

The solid line on Fig. 1(a) shows that this function fits the
plotted points well, and it is included on all the other plots
for reference, i.e. to emphasize the features which are not
due to edge effects.
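
As a rough numerical check, Eqn. (4) can be integrated by the midpoint rule
and compared with the standard closed-form expression for the distance between
two random points in a unit disc. The closed form is a textbook result quoted
here as an assumption; it does not appear in the text above, and both function
names are illustrative.

```python
import math


def p_disc_integral(s, n_steps=4000):
    """Evaluate Eqn. (4) for a unit-radius uniform disc by midpoint integration."""
    if s <= 0.0 or s >= 2.0:
        return 0.0
    lower = abs(1.0 - s)  # lower limit: 1 - s for s < 1, s - 1 for s > 1
    h = (1.0 - lower) / n_steps
    total = 0.0
    for k in range(n_steps):
        r = lower + (k + 0.5) * h
        cos_theta = (r * r + s * s - 1.0) / (2.0 * r * s)
        # Eqn. (5), clamped to guard against rounding just outside [-1, 1].
        theta = math.acos(max(-1.0, min(1.0, cos_theta)))
        total += theta * r * h
    edge_term = 4.0 * s * total / math.pi
    if s < 1.0:
        return 2.0 * s * (1.0 - s) ** 2 + edge_term
    return edge_term


def p_disc_closed(s):
    """Closed form for the same distribution (assumed textbook result)."""
    return ((4.0 * s / math.pi) * math.acos(s / 2.0)
            - (2.0 * s * s / math.pi) * math.sqrt(1.0 - s * s / 4.0))
```

For small $s$ both expressions approach $2s$, which is the no-edge limit
discussed above.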

When derived in this way, the $p(s)$ plot contains little information about
the distribution of binary separations, since they are all in the first bin.
However, it seems to be well established that the distribution of binary
separations is approximately scale free over a wide range of separations
($\eta_{\rm binary} \simeq -2$). The more critical issue, and the one with
which we are concerned here, is the distribution of separations in the
clustering regime and what it tells us about the overall structure of the
cluster. This information is well represented by $p(s)$, as can be seen from
Figs. 1(b) through 1(f), which show the results obtained for the other five
types of non-fractal artificial star cluster. Figure 1(b) shows how $p(s)$ is
skewed towards smaller $s$ values for a disc with a centrally concentrated
surface-density, $N \propto r^{-1}$. Figures 1(c) to 1(f) show spherical
clusters having volume-density gradients $n \propto r^{-\alpha}$ with
$\alpha = 0$, 1, 2, and 2.9. Again the distribution shifts to smaller $s$
values as the sphere becomes more centrally concentrated (i.e. with increasing
$\alpha$).



3.3 The Normalised Correlation Length 

One feature which distinguishes the plots is the location of the maximum, i.e.
the separation $s_{\max}$ at which $p(s)$ is largest. As a cluster becomes more
centrally condensed, $s_{\max}$ moves to smaller values, and the amplitude of
the maximum increases. However, for an individual cluster $s_{\max}$ will not
be well defined, and so it is not a robust measure.

A better measure of this trend is the Normalized Correlation Length for each
cluster. The Correlation Length is the mean separation $\bar{s}$ between stars
in the cluster, and it is normalized by dividing by $R_{\rm cluster}$. The
second column of Table 1 gives mean values of $\bar{s}$ and their standard
deviations, for the various artificial cluster types. The $\bar{s}$ values for
the five real star clusters are also given.
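
A direct way to compute the Normalized Correlation Length from a list of
projected positions is sketched below. The function name is an assumption;
the definition of $R_{\rm cluster}$ follows Section 3.2.

```python
import math


def normalised_correlation_length(stars):
    """Mean pairwise separation divided by the cluster radius R_cluster."""
    n = len(stars)
    cx = sum(x for x, _ in stars) / n
    cy = sum(y for _, y in stars) / n
    r_cluster = max(math.hypot(x - cx, y - cy) for x, y in stars)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.hypot(stars[i][0] - stars[j][0],
                                stars[i][1] - stars[j][1])
    mean_separation = total / (n * (n - 1) / 2)
    return mean_separation / r_cluster
```

For four stars at the corners of a unit square, for example, the mean
separation is $(4 + 2\sqrt{2})/6$ and $R_{\rm cluster} = \sqrt{2}/2$, so the
function returns $4/(3\sqrt{2}) + 2/3$.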

The shapes of the $p(s)$ plots, and hence also the $\bar{s}$ values, are
independent of the number of stars in the cluster. In trials with cluster
sizes of 100 to 1000 stars, $\bar{s}$ stays within one standard deviation of
the mean value for 200 stars. This is at first sight surprising. A 1000-star
cluster is so much more dense than a 100-star cluster, that one might expect
the mean separation of stars to be smaller. However, although each star has
more close neighbours, it also has more distant neighbours, and the value of
$\bar{s}$ remains constant. This is an attractive feature of the Normalised
Correlation Length as a statistical descriptor for clusters. From Table 1, we
see that $\bar{s}$ decreases monotonically with increasing $\alpha$, and can
therefore be used to estimate $\alpha$ for star clusters which are presumed a
priori to have radial density gradients.

Importantly, cluster types 2D1 and 3D2 are easily distinguished by their
$\bar{s}$ values and their $p(s)$ plots, despite the widespread but fallacious
assumption that a three-dimensional cluster with volume-density
$n \propto r^{-2}$ is, when projected on the sky, similar to a two-dimensional
cluster with surface-density $N \propto r^{-1}$. In fact it is clusters of
types 2D1 and 3D1 (i.e. with the same exponent, $d\ln[N]/d\ln[r] \simeq -1$
and $d\ln[n]/d\ln[r] \simeq -1$) which are hard to distinguish.

$p(s)$ plots for the real clusters are shown on Fig. 2. IC348 and $\rho$
Ophiuchus resemble clusters of type 3D2, both on the basis of their $\bar{s}$
values (Table 1), and the shapes of their $p(s)$ plots (Figs. 2(a) and 2(c)).
For IC2391 the $\bar{s}$ value and the $p(s)$ plot (Fig. 2(b)) are most like
those for clusters of type 3D1.



3.4 The effect of subclusters on p(s) and s 

Chamaeleon and Taurus have correlation lengths intermediate between types 3D1
and 3D2, but their $p(s)$ plots are clearly not generic. This is because they
contain sub-clusters, as illustrated in Figs. 5(d) and 5(e). Consequently
$p(s)$ has multiple maxima. In some cases these maxima can be identified with
(i) separations between stars in the same sub-cluster (the maximum at the
smallest separations) and (ii) separations between stars in two distinct
sub-clusters (maxima at larger separations, corresponding to the separation
between the two sub-clusters). If there are $\mathcal{N}_{\rm sub}$
sub-clusters, there can be up to
$1 + \mathcal{N}_{\rm sub}(\mathcal{N}_{\rm sub} - 1)/2$ maxima, but fewer if
there is degeneracy in the distances between sub-clusters. After smoothing,
the $p(s)$ plot for Chamaeleon (Fig. 2(e)) shows two distinct maxima,
suggesting at least two sub-clusters, and Fig. 5(e) does indeed show two
sub-clusters. They are separated by $\sim 1$, hence giving rise to the maximum
in $p(s)$ at $s \sim 1$. However, after smoothing, the $p(s)$ plot for Taurus
(Fig. 2(d)) shows only three well defined maxima, suggesting at most three
sub-clusters, whereas Fig. 5(d) shows at least eight well defined
sub-clusters. Evidently the $p(s)$ plot is not a robust diagnostic of
sub-clustering.

If we now consider artificial fractal star clusters with the same fractal
dimension $D$, we find that there is so much variance in their individual
$p(s)$ plots that we cannot sensibly define a mean $p(s)$ plot. However, we
can still compute the mean Normalised Correlation Length $\bar{s}$ and its
variance. The results are given in Table 1. We see that $\bar{s}$ increases
monotonically with increasing $D$ and can therefore be used to estimate $D$
for star clusters which are presumed a priori to be fractal.

Moreover, the value of $\bar{s}$ for star clusters of type F3.0 is essentially
the same as for clusters of type 3D0, as it should be. The small difference is
attributable to the fact that in constructing clusters of type F3.0 the
positioning of the individual stars is not completely random, whereas for type
3D0 it is.

However, the range of $\bar{s}$ for $D$ in (1.5, 3.0) is almost identical to
that for $\alpha$ in (0, 2). Therefore $\bar{s}$ is degenerate and cannot on
its own be used to distinguish multi-scale (fractal) sub-clustering from a
large-scale radial density gradient.



4 MINIMAL SPANNING TREES 

The Minimal Spanning Tree (MST) is the unique* network of straight lines
joining a set of points, such that the total length of all the lines
(hereafter 'edges') in the network is minimised and there are no closed loops.
The construction of such a tree is described by Gower and Ross (1969).
Starting at any point, an edge is created joining that point to its nearest
neighbour. The tree is then extended by always constructing the shortest link
between one of its nodes and an unconnected point, until all the points have
been connected. Figure 3 shows the MSTs for the real star clusters $\rho$
Ophiuchus, Taurus, Chamaeleon, IC348 and IC2391.
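
The construction just described (grow the tree from any starting point by
repeatedly adding the shortest link to an unconnected point) is the procedure
commonly known as Prim's algorithm. A minimal sketch, with an assumed function
name and a simple $O(\mathcal{N}_{\rm total}^2)$ loop that is adequate for a
few hundred stars:

```python
import math


def mst_edge_lengths(points):
    """Return the n-1 edge lengths of the MST of 2D points (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    in_tree = [False] * n
    in_tree[0] = True  # start the tree at an arbitrary point
    # best[i] is the shortest known link from point i to any node in the tree.
    best = [dist(points[0], p) for p in points]
    edges = []
    for _ in range(n - 1):
        # Attach the unconnected point with the shortest link to the tree.
        k = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[k] = True
        edges.append(best[k])
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], dist(points[k], points[i]))
    return edges
```

Only the edge lengths are returned, since these are all that the measures in
Section 4.1 require.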

The use of Minimal Spanning Trees (MSTs) as a probe of cosmological structure
was explored by Barrow, Bhavsar and Sonoda (1985), and a further refinement,
the self-avoiding random walk, was described by Baugh (1993). Although the
approach seemed promising as a means of picking out clumps and filaments, the
only statistical analysis of the MST for cosmological purposes of which we are
aware is due to Graham, Clowes and Campusano (1995), who adopted methods
developed by Hoffman and Jain (1983) and Dussert et al. (1987) and applied
them to the distribution of quasars. We describe their analysis in Appendix B,
and show that it does not work well for star clusters.

* Strictly speaking, if the array of points contains two or more pairs with
exactly the same separation, the network may not be unique, as the points may
be connected in a different order. However, even if this is the case, the
total length of edges and the distribution of edge-lengths is preserved for
all solutions.



4.1 The Normalised Mean Edge Length 

Once the MST of a cluster has been constructed it is straightforward to
compute the Mean Edge Length, $\bar{m}$. Unlike the Normalised Correlation
Length $\bar{s}$, $\bar{m}$ is not independent of the number of stars in the
cluster, $\mathcal{N}_{\rm total}$. As $\mathcal{N}_{\rm total}$ increases,
more short edges are created on the MST and $\bar{m}$ decreases. The expected
total length of the MST of a random array of $\mathcal{N}_{\rm total}$ points,
uniformly distributed over a two-dimensional area $A$, is asymptotically
proportional to $(\mathcal{N}_{\rm total} A)^{1/2}$ (Hammersley et al. 1959).
As there are $\mathcal{N}_{\rm total} - 1$ edges, the mean edge length is
asymptotically proportional to
$(\mathcal{N}_{\rm total} A)^{1/2}/(\mathcal{N}_{\rm total} - 1)$, and so this
factor should be used to normalise the mean edge length of clusters having
different areas $A$ and/or different numbers of stars
$\mathcal{N}_{\rm total}$.
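
The normalisation can be written out explicitly as in the sketch below. The
text does not say here how the area $A$ is measured, so taking
$A = \pi R_{\rm cluster}^2$ is an assumption of this example, as is the
function name.

```python
import math


def normalised_mean_edge_length(edge_lengths, n_stars, r_cluster):
    """Mean MST edge length divided by (N_total * A)**0.5 / (N_total - 1)."""
    mean_edge = sum(edge_lengths) / (n_stars - 1)
    area = math.pi * r_cluster ** 2  # assumed definition of the cluster area A
    factor = math.sqrt(n_stars * area) / (n_stars - 1)
    return mean_edge / factor
```

Because the factor $\mathcal{N}_{\rm total} - 1$ cancels, the result is simply
the total MST length divided by $(\mathcal{N}_{\rm total} A)^{1/2}$.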

The resulting Normalised Mean Edge Length $\bar{m}$ has been evaluated for 100
realisations of each type of artificial star cluster, and for the real star
clusters, and the results are recorded in Table 1 (column 3). Also recorded in
Table 1 (column 4) is the mean of the standard deviations of the MST edge
lengths, $\sigma_m$. This quantity is used in Appendix B.



4.2 Q 

Table 1 shows that for artificial clusters of type 2D$\alpha$, 3D$\alpha$ and
FD, both $\bar{m}$ and $\bar{s}$ decrease monotonically as $\alpha$ increases
(i.e. the degree of central concentration becomes more severe) or as $D$
decreases (i.e. the degree of sub-clustering becomes more severe). However,
$\bar{s}$ decreases more quickly than $\bar{m}$ as $\alpha$ is increased, while
$\bar{m}$ decreases more quickly than $\bar{s}$ as $D$ is decreased. Thus, the
ratio

$$
Q = \frac{\bar{m}}{\bar{s}}
\tag{6}
$$

yields a measure which not only quantifies, but also distinguishes between, a
smooth overall radial density gradient and multi-scale fractal sub-clustering.
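
As a worked check of this ratio, the $\bar{s}$ and $\bar{m}$ values listed for
the five real clusters in Table 1 can be divided directly. The dictionary and
function names below are illustrative only.

```python
# (s_bar, m_bar) pairs copied from columns 2 and 3 of Table 1.
TABLE_1_REAL = {
    "IC2391": (0.74, 0.49),
    "Chamaeleon": (0.63, 0.42),
    "Taurus": (0.55, 0.26),
    "rho Ophiuchus": (0.53, 0.45),
    "IC348": (0.49, 0.48),
}


def q_parameter(m_bar, s_bar):
    """Q = m_bar / s_bar."""
    return m_bar / s_bar


q_values = {name: round(q_parameter(m_bar, s_bar), 2)
            for name, (s_bar, m_bar) in TABLE_1_REAL.items()}
```

Rounded to two decimal places, the ratios are 0.66, 0.67, 0.47, 0.85 and 0.98,
in agreement with column 5 of Table 1.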

Mean values of $Q$ for the various types of artificial star cluster are
recorded in Table 1 (column 5). For artificial clusters with a smooth
large-scale radial density gradient (type 3D$\alpha$), $Q$ increases from
$Q \simeq 0.80$ to $Q \simeq 1.50$ as the degree of central concentration
increases from $\alpha = 0$ (statistically uniform number-density) to
$\alpha = 2.9$ ($n \propto r^{-2.9}$). For artificial clusters with fractal
sub-structure (type FD), $Q$ decreases from $Q \simeq 0.80$ to
$Q \simeq 0.45$ as the degree of sub-clustering increases from $D = 3.0$
(uniform number-density, no sub-clustering) to $D = 1.5$ (strong
sub-clustering).

Figure 3. Minimal Spanning Trees for (a) $\rho$ Ophiuchus, (b) IC2391,
(c) IC348, (d) Taurus, and (e) Chamaeleon.

We can therefore construct a plot (Figure 4) of $D$ against $Q$ for
$Q < 0.80$, and $\alpha$ against $Q$ for $Q > 0.80$. For any real cluster we
can compute its $Q$ value, and then use Figure 4 to read off its notional
fractal dimension $D'$ (if $Q < 0.80$, implying sub-clustering), or its radial
density exponent $\alpha$ (if $Q > 0.80$, implying a large-scale radial
density gradient).
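
In the absence of Figure 4 itself, the read-off can be approximated by linear
interpolation between the mean $Q$ values in Table 1. This is only a rough
stand-in for the plotted calibration curve: the interpolation scheme, the
clamping outside the calibrated range, and the function names are all
assumptions of this sketch.

```python
# Calibration points taken from the mean Q values in Table 1.
CALIBRATION_FRACTAL = [(0.45, 1.5), (0.61, 2.0), (0.73, 2.5), (0.80, 3.0)]  # (Q, D)
CALIBRATION_RADIAL = [(0.79, 0.0), (0.84, 1.0), (0.93, 2.0), (1.50, 2.9)]   # (Q, alpha)


def interpolate(points, q):
    """Piecewise-linear interpolation, clamped at both ends of the table."""
    if q <= points[0][0]:
        return points[0][1]
    if q >= points[-1][0]:
        return points[-1][1]
    for (q0, v0), (q1, v1) in zip(points, points[1:]):
        if q0 <= q <= q1:
            return v0 + (v1 - v0) * (q - q0) / (q1 - q0)


def interpret_q(q):
    """Return ("D", notional fractal dimension) or ("alpha", radial exponent)."""
    if q < 0.80:
        return ("D", interpolate(CALIBRATION_FRACTAL, q))
    return ("alpha", interpolate(CALIBRATION_RADIAL, q))
```

For example, `interpret_q(0.67)` gives a notional fractal dimension of about
2.25, and `interpret_q(0.85)` gives a radial exponent of about 1.1.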



The small kink at $Q \simeq 0.8$ is due to the fact that in constructing a
cluster of type F3.0 the stars are positioned regularly (in the sense that at
each generation, each sub-cube of space is occupied) and therefore the
number-density is artificially uniform; in contrast, when we construct a
cluster of type 3D0 the stars are positioned randomly, so that the density is
only uniform in a statistical sense and there are Poisson fluctuations in the
local density.

Fractal dimensions obtained from Fig. 4 in this way are only notional, because
$Q$ (or any other single measure) can reflect sub-clustering, but cannot
capture whether the sub-clustering is hierarchically self-similar.

Using Figure 4, we infer that Taurus, IC2391 and Chamaeleon have substructure
with notional fractal dimensions $D = 1.5$, 2.2 and 2.25. In contrast, $\rho$
Ophiuchus and IC348 appear to be centrally concentrated, with radial density
exponents $\alpha = 1.2$ and 2.2. These inferences agree well with an
intuitive reading of the raw data shown in Fig. 5.



4.3 The effect of binary companions on the MST
Edge Length



The MST will normally link a star to its binary companion, as this will
usually be the shortest way of adding one or other of the stars to the tree.
Binaries create very short edges and therefore a large population of binary
stars will cause a noticeable reduction in the mean edge length, $\bar{m}$. Of
the five real clusters considered in this paper, Taurus has been subjected to
particularly close scrutiny and has a larger identified population of binaries
than any of the others. As the binaries are not part of the clustering regime,
it is important to establish whether they are distorting the result.

Figure 4. $Q$ plot for artificial star clusters. For $Q < 0.80$, the fractal
dimension $D$ should be read from the lefthand axis, and for $Q > 0.80$, the
radial density exponent $\alpha$ should be read from the righthand axis. The
small kink at $Q \simeq 0.8$ is explained in the text.

Table 2. Sources of positions for cluster members and approximate
ages and crossing times for clusters. (Crossing times were
calculated using a typical velocity dispersion of 2 km/sec.)

Name          Members   Age (Myr)   $T_{\rm cross}$ (Myr)   Sources
IC2391        166       53          2.5                     Barrado et al. (2001)
Cham.         136       0.1-40      2.7                     Lawson et al. (1996)
                                                            Ghez et al. (1997)
Taurus        215       1.0         10.0                    Briceno et al. (1993)
                                                            Ghez et al. (1993)
                                                            Gomez et al. (1992)
                                                            Hartmann et al. (1991)
                                                            Herbig et al. (1988)
                                                            Leinert et al. (1993)
                                                            Simon et al. (1995)
                                                            Waer et al. (1988)
                                                            Luhman et al. (2003)
$\rho$ Oph.   199       0.3-2.0     1.35                    Bontemps et al. (2001)
IC348         288       2.0         2.0                     Luhman et al. (2003)

Using the MST, all pairs of stars lying closer together than $10^{-4}$ of the
cluster radius were pruned, leaving single stars. For Chamaeleon, $\rho$
Ophiuchus, IC348 and IC2391, only 3, 0, 1 and 2 such pairs were found; Taurus,
however, was pruned from 215 down to only 137 primary stars. For the pruned
version of Taurus, $\bar{s}$ increased from 0.55 to 0.57, while $\bar{m}$
increased from 0.26 to 0.33 and $Q$ increased from 0.47 to 0.58. Removal of
the binaries resulted in the notional fractal dimension for Taurus being
increased from 1.5 to 1.9. This demonstrates that in a cluster with a large
binary population, it is important to prune the close companions before
evaluating $\bar{s}$, $\bar{m}$ and $Q$.
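
The pruning of close pairs could be approximated as below. The text prunes
using the MST; this sketch instead uses a simple greedy pass over the star
list, which is a swapped-in simplification, and the function name and the
choice of which member of a pair to keep are assumptions.

```python
import math


def prune_close_pairs(stars, fraction=1e-4):
    """Drop any star lying closer than fraction * R_cluster to a star already kept."""
    n = len(stars)
    cx = sum(x for x, _ in stars) / n
    cy = sum(y for _, y in stars) / n
    r_cluster = max(math.hypot(x - cx, y - cy) for x, y in stars)
    limit = fraction * r_cluster
    kept = []
    for x, y in stars:
        # Keep the star only if no previously kept star lies within the limit.
        if all(math.hypot(x - kx, y - ky) >= limit for kx, ky in kept):
            kept.append((x, y))
    return kept
```

The pruned list can then be passed to whatever routines are used to evaluate
$\bar{s}$, $\bar{m}$ and $Q$.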



5 DISCUSSION 

The ratio of the Normalized Mean Edge Length to the Normalized Correlation
Length, $Q$, is effective in distinguishing between a smooth large-scale
radial density gradient and multi-scale fractal sub-clustering, because it is
sensitive not only to the frequency of small separations between stars, but
also to their spatial distribution.

The MST Edge Length $\bar{m}$ is a simple average of the distances of stars to
their (usually) closest neighbours. If one star is moved very close to
another, the change in $\bar{m}$ will be diluted by the total number of stars,
$\mathcal{N}_{\rm total}$. However, when calculating the mean distance of
companions for all stars, only one star out of $\mathcal{N}_{\rm total}$ in the
cluster has had one of its $\mathcal{N}_{\rm total} - 1$ companions moved very
close. The change in $\bar{s}$ is therefore diluted by
$\mathcal{N}_{\rm total}^{2}$. Thus, when small separations are scattered all
over the cluster, increasing the number of small separations causes both
$\bar{m}$ and $\bar{s}$ to decrease, but $\bar{s}$ decreases more slowly than
$\bar{m}$. This is the case for decreasing fractal dimension.

For radially concentrated clusters, by contrast, increasing the clustering
creates more small separations between stars, but these are all in the central
region of the cluster. Moving another star to this area affects $\bar{m}$ in
the normal way, the star having a newly short edge length between it and its
nearest neighbour, and the change in the mean distance being diluted by
$\mathcal{N}_{\rm total}$. However, the large number of other stars in the
centre also gain another close neighbour. The decrease in $\bar{s}$ is
therefore compounded and exceeds that in $\bar{m}$.

Consequently, the quotient $Q = \bar{m}/\bar{s}$ successfully distinguishes
between clusters which have a smooth large-scale radial density gradient and
clusters which have multi-scale fractal sub-clustering, in a way which agrees
with an intuitive analysis but which cannot be accomplished using existing
methods such as Larson Plots or Box Dimension Plots. An additional advantage
over these methods is that the calculation of $Q$ is quantitative and
objective, as no intervention is required in the normalisation process, in the
construction of the MST, or in choosing a range over which to calculate a
slope.

Figure 5. Raw data for all real star clusters analysed in the paper. The
clusters have been centred on the mean position of all stars and scaled so
that the distance from the centre to the most distant star is unity.

We should emphasize that classical methods for evalu- 
ating the density profile of a cluster, or its fractal dimension, 
are not viable for clusters with ~ 200 members, primarily 
because of low-number statistics. For example, if one at- 
tempts to define the projected radial density profile for a 
real cluster of stars by counting stars in different annuli, the 
result is very noisy. 
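
To make the low-number problem concrete, the following minimal sketch counts stars in equal-width annuli about the mean position and attaches a fractional uncertainty to each count. The function name `annulus_profile`, the choice of equal-width annuli, and the use of Poisson counting statistics (fractional error of 1/sqrt(count)) are assumptions made for illustration; they are not taken from the analysis in this paper.

```python
import math

def annulus_profile(points, n_annuli):
    """Projected radial profile from star counts in equal-width annuli.

    points: list of (x, y) projected positions.
    Returns one tuple per annulus:
    (r_inner, r_outer, count, surface_density, fractional_error).
    """
    n = len(points)
    # Centre the cluster on the mean position of all stars.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii)
    if r_max == 0.0:
        raise ValueError("all stars coincide; no profile can be defined")
    width = r_max / n_annuli

    counts = [0] * n_annuli
    for r in radii:
        # The most distant star is placed in the outermost annulus.
        k = min(int(r / width), n_annuli - 1)
        counts[k] += 1

    profile = []
    for k, c in enumerate(counts):
        r_in, r_out = k * width, (k + 1) * width
        area = math.pi * (r_out ** 2 - r_in ** 2)
        # Assumed Poisson statistics: fractional error = 1 / sqrt(count).
        frac_err = 1.0 / math.sqrt(c) if c > 0 else math.inf
        profile.append((r_in, r_out, c, c / area, frac_err))
    return profile
```

Under the Poisson assumption, 200 stars spread over 10 annuli give about 20 stars per annulus on average, and hence a fractional uncertainty of roughly 22 per cent in each bin.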

Alternatively, if one attempts to determine the mean 
projected radial density profile for a 200-member artificial 
cluster having a given radial density profile in three dimen- 
sions, using many different realizations and with a view to 
comparing this with a real cluster, one finds that the stan- 
dard deviation is very large, and so the diagnostic power of 
this profile is poor. 

In the same spirit one might attempt to construct the
Box Dimension Plot (BDP) of a real cluster and compare
it with the mean BDP of artificial star clusters having a
given fractal dimension. To construct a BDP one divides
the projected image of the star cluster into a grid of square
cells of side $l$ and counts the number of cells,
$\mathcal{N}_{\rm occ}(l)$, which are occupied by stars. Then,
by repeating this for different values of $l$, one obtains a plot
of $\log(\mathcal{N}_{\rm occ}(l))$ against $-\log(l)$.
For a true fractal this plot is a straight line with slope equal
to the fractal dimension. However, for a star cluster with
only ~ 200 members, the plot is not linear. By treating
many realisations of artificial clusters all having the same
fractal dimension and the same number of stars, one can
define a mean BDP. However, the mean BDP is not very
strongly dependent on the fractal dimension and it has a
large standard deviation. Therefore the Box Dimension Plot
of a real cluster does not give a useful constraint on its fractal
dimension.
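
The construction of a Box Dimension Plot described above can be sketched as follows. The function names `box_count`, `box_dimension_plot` and `fitted_slope` are hypothetical, and the straight-line fit over all supplied cell sizes is an assumption made for illustration; selecting that range is exactly the subjective step that limits the method for real clusters.

```python
import math

def box_count(points, side):
    """Number of square cells of the given side occupied by at least one star."""
    occupied = {(math.floor(x / side), math.floor(y / side)) for x, y in points}
    return len(occupied)

def box_dimension_plot(points, sides):
    """Return (-log(l), log(N_occ(l))) for each cell side l in sides."""
    return [(-math.log(l), math.log(box_count(points, l))) for l in sides]

def fitted_slope(plot):
    """Least-squares slope of the plot; for a true fractal this estimates D."""
    n = len(plot)
    mean_x = sum(x for x, _ in plot) / n
    mean_y = sum(y for _, y in plot) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in plot)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in plot)
    return sxy / sxx
```

For a regular grid of points filling a square the fitted slope is 2, and for points along a line it is 1. With only ~ 200 irregularly placed stars the points of the plot do not lie on a straight line, so the fitted value depends on which cell sizes are included.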

It is for this reason that we have sought integral mea- 
sures of cluster structure. The same philosophy informs the 
use of equivalent width when evaluating noisy spectral lines 
(for example). 

We also note that a cluster cannot have a large-scale 
radial density gradient, and at the same time be fractally 
sub-clustered. A cluster could have a large-scale radial den- 
sity gradient and non-fractal sub-clustering - but then it 
would require more parameters to characterize the struc- 
ture, and its diagnosis would become correspondingly more 
difficult (if not impossible for clusters with ~ 200 stars). 

In Table 2 we list estimates for the ages and the crossing
times of the clusters we have analyzed. On the basis of simple
arguments, we might expect the Q value of a cluster to
increase with time, as the substructure dissolves and the
overall cluster relaxes to a radially concentrated density profile.
However, this is not evident in the small sample treated here.
Taurus has an age much less than its crossing time, which is
consistent with its small Q value and low fractal dimension.
On the other hand, IC2391 and Chamaeleon have ages much
greater than their crossing times and yet they are still
fractal with relatively low Q values. In contrast, ρ Ophiuchus
and IC348, which have ages comparable with their crossing
times, are both centrally condensed, with no discernible
substructure. We should, however, caution against drawing
firm conclusions from such a small sample. We also note that
young clusters observed at short wavelengths (i.e. in the
optical) may appear to have substructure due to patchy
observation; long-wavelength surveys are therefore preferable
for embedded young star clusters.

Figure 6. MST ($\bar{m}$, $\sigma_m$)-plots. (a) ($\bar{m}$, $\sigma_m$)-plane, showing the regions of the plane in which well characterised distributions of points
converge (from Graham et al). *: random distributions, 1: clustered structures, 2: concentration gradients, 3: quasi periodic tilings, 4:
highly organised distributions. (b) ($\bar{m}$, $\sigma_m$)-plane, with artificial star clusters of types 3D0 to 3D2.9 and F1.5 to F3.0 plotted. The five
real clusters are indicated by symbols * : Taurus, o : ρ Ophiuchus, X : IC2391, □ : IC348, △ : Chamaeleon.



6 CONCLUSIONS 

We have explored two statistical measures for analysing
objectively the observed (i.e. projected) structures of star
clusters. These measures are based on the Mean Surface Density
of Companions (MSDC), and the Minimal Spanning Tree
(MST). The measures are $\bar{s}$, the normalised mean separation
between stars, and $\bar{m}$, the normalised mean edge-length of
the MST, both of which are independent of the number of
stars in the cluster. For artificial star clusters, created with
a smooth large-scale radial density profile ($n \propto r^{-\alpha}$),
and for artificial star clusters created with sub-structure having
fractal dimension D, $\bar{s}$ and $\bar{m}$ both decrease with
increasing $\alpha$ and/or decreasing D, but at different rates.
Hence a cluster with a radial gradient can be distinguished from
one with sub-structure by evaluating $Q = \bar{m}/\bar{s}$. For a
cluster of uniform volume-density (i.e. $\alpha = 0$ and D = 3.0),
Q ~ 0.80. If the cluster is made more centrally condensed by
increasing $\alpha$, Q increases monotonically, reaching Q ~ 1.50
at $\alpha = 2.9$. Conversely, if the cluster is given sub-structure
by reducing D, Q decreases monotonically, reaching Q ~ 0.45 at
D = 1.5.
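
The calculation of Q summarised above can be sketched in a few lines. This is a minimal illustration and not the code used for the paper. The function name `q_parameter` is hypothetical, the MST is built with a simple O(N^2) Prim's algorithm, and the cluster area is assumed to be that of a circle whose radius is the distance from the mean position to the most distant star.

```python
import math

def q_parameter(points):
    """Return (m_bar, s_bar, Q) for a list of projected (x, y) positions."""
    n = len(points)
    # Centre = mean position; radius = distance to the most distant star.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    radius = max(math.hypot(x - cx, y - cy) for x, y in points)
    area = math.pi * radius ** 2  # assumption: circular cluster area

    # s_bar: mean separation over all pairs, divided by the cluster radius.
    total_sep = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total_sep += math.dist(points[i], points[j])
    s_bar = total_sep / (n * (n - 1) / 2) / radius

    # Total MST length by Prim's algorithm, growing the tree from star 0.
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known link from each star to the tree
    best[0] = 0.0
    mst_length = 0.0
    for _ in range(n):
        u = min((k for k in range(n) if not in_tree[k]), key=best.__getitem__)
        in_tree[u] = True
        mst_length += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d

    # m_bar: mean edge length divided by (N A)^(1/2) / (N - 1).
    mean_edge = mst_length / (n - 1)
    m_bar = mean_edge / (math.sqrt(n * area) / (n - 1))
    return m_bar, s_bar, m_bar / s_bar
```

With the values quoted above, a result well above Q ~ 0.80 would point to a radial density gradient and a result well below it to sub-structure.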



On the basis of their Q values, ρ Ophiuchus and IC348
have radial gradients with $\alpha \sim 1.2 \pm 0.3$ and
$2.2 \pm 0.2$, respectively. Chamaeleon and IC2391 have
sub-structure with notional fractal dimension D' ~ 2.2 ± 0.2.
Taurus has even more sub-structure, with D' ~ 1.55 ± 0.25, and
if the binaries in Taurus are treated as single systems, D'
increases to 1.9 ± 0.2. D' is only a notional fractal dimension,
because the integral measures we have defined do not give any
indication of whether the sub-structure is hierarchically
self-similar. (Indeed, for clusters having only ~ 200 stars the
range of separations is too small to possess hierarchical
self-similarity.)
APPENDIX B: DETECTING RANDOMNESS
AND CLUSTERING USING THE MINIMAL 
SPANNING TREE. 

Graham et al (1995) have applied the MST to quasar
clustering on very large scales, using a method which was
developed by Dussert et al (1987) for characterising biological
structures. In Dussert's method, the mean $\bar{m}$ and
standard deviation $\sigma_m$ of the edge lengths of the MST are
first computed and normalised by dividing by the factor
$(\mathcal{N}_{\rm total} A)^{1/2}/(\mathcal{N}_{\rm total} - 1)$,
then plotted on the ($\bar{m}$, $\sigma_m$)-plane.
Fig. 6(a), reproduced from Dussert et al. (1987), shows the
theoretical locations on the ($\bar{m}$, $\sigma_m$)-plane for
different types of clustering in two dimensions. The region of
the ($\bar{m}$, $\sigma_m$)-plane around the central star
represents the locus of a random distribution. The region around
'1' represents the locus of distributions dominated by
sub-clustering. The region around '2' represents the locus of
distributions dominated by radial concentration gradients. The
region around '3' represents the locus of distributions dominated
by quasi-periodic tilings. And the region around '4' represents
the locus of highly organized distributions (i.e. lattices).

Fig. 6(b) shows the loci on the ($\bar{m}$, $\sigma_m$)-plane for
the various artificial star cluster types and the five real
clusters, and reveals some drawbacks to this plot. The locus for
artificial clusters with a radial density gradient does indeed
tend towards region 2 with increasing $\alpha$ (i.e. greater
degree of central concentration), although only for
$\alpha \gtrsim 2$ are they clearly distinguishable from a purely
random distribution. Similarly, the locus for artificial clusters
with fractal sub-clustering tends towards region 1 with decreasing
D (i.e. greater degree of sub-clustering) for $D \gtrsim 2.0$.
However, for $D \lesssim 2.0$, this trend is abandoned, and the
locus moves towards region 2; in other words, a cluster with a low
fractal dimension and hence a high degree of sub-clustering
masquerades - on the ($\bar{m}$, $\sigma_m$)-plane - as a cluster
with a strong radial density gradient, albeit not precisely of the
form $n \propto r^{-\alpha}$. Moreover, Taurus, which to the human
eye appears to have the most well defined sub-clustering of all
five real clusters, masquerades on the ($\bar{m}$, $\sigma_m$)-plane
as a cluster with a strong radial density gradient,
$\alpha \sim 2.7$.

We conclude that the ($\bar{m}$, $\sigma_m$)-plane is not able to
distinguish between a smooth large-scale radial density gradient
and multi-scale fractal sub-clustering.



APPENDIX A: RAW DATA 

Table 2 gives the sources of the positions of stars (or, in
the case of ρ Ophiuchus, protostars) used in the analysis
of Sections 3 and 4. These positions are plotted on Fig. 5.